Hive Joins

JOINS in SQL are commands which are used to combine rows from two or more tables, based on a related column between those tables. There are predominantly used when a user is trying to extract data from tables which have one-to-many or many-to-many relationships between them.
  • Inner Join
  • Full Join
  • Left Join
  • Right Join
Types Of Joins In SQL - SQL Joins - Edureka

Left Outer Join

 select

 select
 table1. col1
,table1.col2
,table2.col1
,table2.col2
from table1
right outer join table2
on table1.matching_col = table2.matching_col;

Full Outer Join
 select
 table1. col1
,table1.col2
,table2.col1
,table2.col2
from table1
full outer join table2
on table1.matching_col = table2.matching_col;


INNER JOIN – Select records that have matching values in both tables.
LEFT JOIN (LEFT OUTER JOIN) – returns all the values from the left table, plus the matched values from the right table, or NULL in case of no matching join predicate
RIGHT JOIN (RIGHT OUTER JOIN) A RIGHT JOIN returns all the values from the right table, plus the matched values from the left table, or NULL in case of no matching join predicate
FULL JOIN (FULL OUTER JOIN) – Selects all records that match either left or right table records.
LEFT SEMI JOIN: Only returns the records from the left-hand table. Hive doesn’t support IN subqueries so you can’t do




TableA:

+----+---------+
| Id |  Data      |
+----+---------+
|  1 | DataA11 |
|  1 | DataA12 |
|  1 | DataA13 |
|  2 | DataA21 |
|  3 | DataA31 |
+----+---------+
TableB:

+----+---------+
| Id |  Data      |
+----+---------+
|  1 | DataB11 |
|  2 | DataB21 |
|  2 | DataB22 |
|  2 | DataB23 |
|  4 | DataB41 |
+----+---------+
Inner Join on column Id will return columns from both the tables and only the matching records:

.----.---------.----.--------------.
| Id |  Data      | Id |  Data      |
:----+---------+----+-----------:
|  1 | DataA11 |  1 | DataB11 |
:----+---------+----+-----------:
|  1 | DataA12 |  1 | DataB11 |
:----+---------+----+-----------:
|  1 | DataA13 |  1 | DataB11 |
:----+---------+----+-----------:
|  2 | DataA21 |  2 | DataB21 |
:----+---------+----+-----------:
|  2 | DataA21 |  2 | DataB22 |
:----+---------+----+-----------:
|  2 | DataA21 |  2 | DataB23 |
'----'---------'----'----------------'
Left Join (or Left Outer join) on column Id will return columns from both the tables and matching records with records from left table (Null values from right table):

.----.---------.----.--------------.
| Id |  Data      | Id |  Data   |
:----+---------+----+---------:
|  1 | DataA11 |  1 | DataB11 |
:----+---------+----+---------:
|  1 | DataA12 |  1 | DataB11 |
:----+---------+----+---------:
|  1 | DataA13 |  1 | DataB11 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB21 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB22 |
:----+---------+----+---------:
|  2 | DataA21 |  2 | DataB23 |
:----+---------+----+---------:
|  3 | DataA31 |    |         |
'----'---------'----'---------'
Right Join (or Right Outer join) on column Id will return columns from both the tables and matching records with records from right table (Null values from left table):

+-----------------------------+
¦ Id ¦  Data   ¦ Id ¦  Data   ¦
+----+---------+----+---------¦
¦  1 ¦ DataA11 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA12 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA13 ¦  1 ¦ DataB11 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB21 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB22 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB23 ¦
¦    ¦         ¦  4 ¦ DataB41 ¦
+-----------------------------+
Full Outer Join on column Id will return columns from both the tables and matching records with records from left table (Null values from right table) and records from right table (Null values from left table):

+-----------------------------+
¦ Id ¦  Data   ¦ Id ¦  Data   ¦
¦----+---------+----+---------¦
¦  - ¦         ¦    ¦         ¦
¦  1 ¦ DataA11 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA12 ¦  1 ¦ DataB11 ¦
¦  1 ¦ DataA13 ¦  1 ¦ DataB11 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB21 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB22 ¦
¦  2 ¦ DataA21 ¦  2 ¦ DataB23 ¦
¦  3 ¦ DataA31 ¦    ¦         ¦
¦    ¦         ¦  4 ¦ DataB41 ¦
+-----------------------------+
Left Semi Join on column Id will return columns only from left table and matching records only from left table:

No comments:

Post a Comment